Experimental demonstration of Shor's algorithm with quantum entanglement

Shor's powerful quantum algorithm for factoring represents a major challenge in quantum computation
and its full realization will have a large impact on modern cryptography. Here we implement
a compiled version of Shor's algorithm in a photonic system using single photons and employing 
the non-linearity induced by measurement. For the first time we demonstrate the core processes, 
coherent control, and resultant entangled states that are required in a full-scale implementation of 
Shor's algorithm. Demonstration of these processes is a necessary step on the path towards a full 
implementation of Shor's algorithm and scalable quantum computing. Our results highlight that 
the performance of a quantum algorithm is not the same as performance of the underlying quantum 
circuit, and stress the importance of developing techniques for characterising quantum algorithms.

As computing technology rapidly approaches the nanoscale, fundamental quantum effects threaten
to introduce an inherent and unavoidable source of noise. An alternative approach embraces
quantum effects for computation. Algorithms based on quantum mechanics allow tasks impossible
with current computers, notably an exponential speed-up in solving problems such as the
factoring problem. Many current cryptographic protocols rely on the computational difficulty
of finding the prime factors of a large number: a small increase in the size of the number
leads to an exponential increase in computational resources. Shor's quantum algorithm for
factoring composite numbers faces no such limitation, and its realization represents a major
challenge in quantum computation.

To date, there have been demonstrations of entangling quantum-logic gates in a range of
physical architectures, from trapped ions and superconducting circuits to single photons.
Photon polarisation experiences essentially zero decoherence in free space; uniquely,
photonic gates have been fully characterised, produced the highest entanglement, and are the
fastest of any architecture. The combination of long decoherence time and fast gate speeds
makes photonic architectures a promising approach for quantum computation, where large
numbers of gates will need to be executed within the coherence lifetime of the qubits.

Shor's algorithm can factor a $k$-bit number using $72k^3$ elementary quantum gates, e.g.
factoring the smallest meaningful number, 15, requires 4608 gates operating on 21 qubits.
This is well beyond the reach of current technology. Recognizing this, Ref. [13] introduced
a compiling technique which exploits properties of the number to be factored, allowing
exploration of Shor's algorithm with a vastly reduced number of resources. Although the
implementation of these compiled algorithms does not directly imply scalability, it does
allow the characterisation of core processes required in a full-scale implementation of
Shor's algorithm. Demonstration of these processes is a necessary step on the path towards
scalable quantum computing. These processes include the ability to generate entanglement
between qubits by coherent application of a series of quantum gates: this represents a
significant challenge with current technology. In the only demonstration to date, a compiled
set of gate operations was implemented in a liquid NMR architecture [14]. However, since the
qubits are at all times in a highly mixed state [15], and the dynamics can be fully modelled
classically [16], neither the entanglement nor the coherent control at the core of Shor's
algorithm can be implemented or verified.

Here we implement a compiled version of Shor's algorithm, using photonic quantum-logic gates
to realise the necessary processes, and verify the resulting entanglement via quantum state
and process tomography [17, 18]. We use a linear-optical architecture where the required
nonlinearity is induced by measurement; current experiments are not scalable, but there are
clear paths to a fully scalable quantum architecture. Our gates do not require pre-existing
entanglement and we encode our qubits into the polarisation of up to four photons. Our
results highlight that the performance of a quantum algorithm is not the same as the
performance of the underlying quantum circuit, and stress the importance of developing
techniques for characterising quantum algorithms.

Only one step of Shor's algorithm to find the factors of a number $N$ requires a quantum
routine. Given a randomly chosen co-prime $C$ (where $1 < C < N$ and the greatest common
divisor of $C$ and $N$ is 1), a quantum routine finds the order of $C$ modulo $N$, defined
to be the minimum integer $r$ that satisfies $C^r \bmod N = 1$. It is straightforward to
find the factors from the order. Consider $N=15$: if we choose $C=2$, the quantum routine
finds $r=4$, and the prime factors are given by the non-trivial greatest common divisor of
$C^{r/2} \pm 1$ and $N$, i.e. 3 and 5; similarly if we choose the next possible co-prime,
$C=4$, we find the order $r=2$, yielding the same factors.
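
The arithmetic in this paragraph can be checked classically. The sketch below is only an illustration: it finds the order by brute force (which is exactly the step the quantum routine is meant to replace), and the function names `find_order` and `factors_from_order` are hypothetical, not taken from the paper.

```python
from math import gcd


def find_order(c: int, n: int) -> int:
    """Return the smallest r >= 1 with c**r % n == 1 (classical brute force)."""
    if gcd(c, n) != 1:
        raise ValueError("c must be co-prime to n")
    r, value = 1, c % n
    while value != 1:
        value = (value * c) % n
        r += 1
    return r


def factors_from_order(c: int, n: int, r: int) -> set:
    """Non-trivial gcds of c**(r/2) +/- 1 with n; empty if r is odd or both gcds are trivial."""
    if r % 2:
        return set()
    half = pow(c, r // 2, n)
    candidates = {gcd(half - 1, n), gcd(half + 1, n)}
    return {f for f in candidates if 1 < f < n}


for c in (2, 4):
    r = find_order(c, 15)
    print(c, r, sorted(factors_from_order(c, 15, r)))
```

For $N=15$ this reproduces the orders quoted above ($r=4$ for $C=2$, $r=2$ for $C=4$) and the factors 3 and 5 in both cases.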

Fig. 1a) shows a conceptual circuit of the quantum order-finding routine. It consists of
three distinct steps: i) register initialisation,
$|0\rangle^{\otimes n}|0\rangle^{\otimes m} \to (|0\rangle+|1\rangle)^{\otimes n}|0\rangle^{\otimes m-1}|1\rangle = \sum_{x=0}^{2^n-1}|x\rangle|0\rangle^{\otimes m-1}|1\rangle$,
where the argument-register is prepared in an equal coherent superposition of all possible
arguments (normalisation omitted by convention); ii) modular exponentiation, which by
controlled application of the order-finding function produces the entangled state
$\sum_{x=0}^{2^n-1}|x\rangle|C^x \bmod N\rangle$; iii) the inverse Quantum Fourier Transform
(QFT) followed by measurement of the argument-register in the logical basis, which with high
probability extracts the order $r$ after further classical processing. If the routine is
standalone, the inverse QFT can be performed using an approach based on local measurement
and feedforward [21]. Note that the inverse QFT in Ref. [14] was unnecessary: it is
straightforward to show this is true for any order-$2^l$ circuit [23].

FIG. 1: a) Conceptual circuit for the order-finding routine of Shor's algorithm for number
$N$ and co-prime $C$. The argument and function registers are bundles of $n$ and $m$ qubits;
the nested order-finding structure uses $U|y\rangle = |Cy \bmod N\rangle$, where the initial
function-register state is $|y\rangle = |1\rangle$. The algorithm is completed by logical
measurement of the argument-register, and reversing the order of the argument qubits. b),c)
Implementation of a) for $N=15$ and $C=4, 2$, respectively; the unitaries are decomposed
into controlled-SWAP gates (CSWAP), marked as x; controlled-phase gates are marked by dots;
H and T represent Hadamard and $\pi/8$ gates. Many gates are redundant, e.g. the second gate
in b), the first and second gates in c). d),e) Partially-compiled circuits of b),c),
replacing CSWAP by controlled-NOT gates, n.b. e) is equivalent to the $N=15$, $C=7$ circuit
in Ref. [14]. f),g) Fully-compiled circuits of d),e), by evaluating $\log_C[C^x \bmod N]$
in the function-register.

Modular exponentiation is the most computationally-intensive part of the algorithm. It can
be realised by a cascade of controlled unitary operations, $U$, as shown in the nested inset
of Fig. 1a). It is clear that the registers become highly entangled with each other: since
$U$ is a function of $C$ and $N$, the entangling operation is unique to each problem. Here we
choose to factor 15 with the first two co-primes, $C=2$ and $C=4$. In these cases entire
sets of gates are redundant: specifically, $U^{2^n}=I$ when $n>0$ for $C=4$, and
$U^{2^n}=I$ when $n>1$ for $C=2$. Figs 1b),c) show the remaining gates for $C=4$ and $C=2$,
respectively, after decomposition of the unitaries into controlled-SWAP gates; this level of
compiling is equivalent to that introduced in Ref. [13]. Further compilation can always be
made since the initial state of the function-register is fixed, allowing the CSWAP gates to
be replaced by controlled-NOT (CNOT) gates as shown in Figs 1d),e) [23].
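
The redundancy of the higher powers of the modular-multiplication unitary can be verified by treating it as a permutation of basis states. This is a minimal sketch under one stated assumption: basis states $y \geq N$ of the four-qubit function-register, which the text does not discuss, are simply left unchanged. All function names are hypothetical.

```python
def modmul_permutation(c: int, n_mod: int, n_qubits: int) -> list:
    """Permutation y -> c*y mod n_mod; states y >= n_mod are left unchanged (assumption)."""
    return [(c * y) % n_mod if y < n_mod else y for y in range(2 ** n_qubits)]


def compose(p: list, q: list) -> list:
    """Permutation obtained by applying q first, then p."""
    return [p[q[y]] for y in range(len(p))]


def redundant_powers(c: int, n_mod: int = 15, n_qubits: int = 4, max_n: int = 3) -> list:
    """Exponents n in 0..max_n for which U^(2^n) acts as the identity."""
    identity = list(range(2 ** n_qubits))
    power = modmul_permutation(c, n_mod, n_qubits)  # U^(2^0)
    result = []
    for n in range(max_n + 1):
        if power == identity:
            result.append(n)
        power = compose(power, power)  # squaring gives U^(2^(n+1))
    return result


print(redundant_powers(4))  # exponents n with U^(2^n) = I for C=4
print(redundant_powers(2))  # exponents n with U^(2^n) = I for C=2
```

The returned exponents are every $n>0$ for $C=4$ and every $n>1$ for $C=2$, which is the condition under which the corresponding controlled gates can be dropped.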

We implemented the order-2-finding circuit, Fig. 1d). The qubits are realised with
simultaneous forward and backward production of photon pairs from parametric
downconversion, Fig. 2a): the logical states are encoded into the vertical and horizontal
polarisations. This circuit required implementing a recently-proposed three-qubit
quantum-logic gate, Fig. 2b), which realises a cascade of $n$ controlled-z gates with
exponentially greater success than chaining $n$ individual gates. The controlled-NOT gates
are realised by combining Hadamards and controlled-z gates based on partially-polarising
beamsplitters. The gates are nondeterministic, with one third success probability when fully
prebiased [12]. A run of each routine is flagged by a fourfold event, where a single photon
arrives at each output. Dependent photons from the forward pass interfere non-classically at
the first partial polariser, Fig. 2d); one photon then interferes with an independent photon
from the backward pass at the second partial polariser. We measured relative nonclassical
visibilities, $V_r = V_{meas}/V_{ideal}$, of 98±2% and 85±6%.

FIG. 2: Experimental schematic. a) Forward and backward photon pairs are produced via
parametric downconversion (PDC) of a frequency-doubled mode-locked Ti:Sapphire laser
(820 nm → 410 nm, Δt=80 fs at 82 MHz repetition rate) through a Type-I 2 mm Bismuth Borate
(BiB3O6) crystal. Photons are input to the circuits via blocked interference filters
(820±3 nm) and single-mode optical fibres, and detected using single photon counting modules
(PerkinElmer AQR-14FC). Coincidences are measured using a quad-logic card driven by a
four-channel constant fraction discriminator. With 500 mW at 410 nm this yielded 60 kHz and
25 kHz twofold coincidence rates for direct detection, which differed due to mismatched pump
focus sizes; the measured fourfold coincidence rate was 35 Hz. b),c) Linear optical circuits
for order-2 and order-4 finding algorithms, with inputs from a) labelled; the letters on the
detectors refer to the Fig. 1 qubits. d),e) Physical optical circuits for b),c), replacing
the classical interferometers with partially-polarising beamsplitters.

Directly encoding the order-4 finding circuit, Fig. 1e), requires six photons and at least
one three-qubit and five two-qubit gates. This is currently infeasible: the best six-photon
rate to date is 30 mHz, which would be reduced by six orders of magnitude using
non-deterministic gates. To explore an order-4 routine, and the different processes therein,
further compilation is necessary. In particular, we can compile circuits 1d),e) by
evaluating $\log_C[C^x \bmod N]$ in the function-register in place of $C^x \bmod N$. This
requires $\log_2[\log_C N]$ function qubits, as opposed to $\log_2 N$, i.e. for $N=15$,
$C=2$, the function-register reduces from 4 to 2 qubits. Note that this full compilation
maintains all the features of the algorithm as originally proposed. Thus the order-4
circuit, Fig. 1e), reduces to a pair of CNOTs, allowing us to implement the circuit in
Fig. 1g). We use a pair of compact optical gates, Fig. 2c),e), each operating on a dependent
pair of photons, resulting in measured visibilities for both of $V_r$=98±2%.
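
To make the full-compilation step concrete, the following sketch lists the function-register contents before and after taking the logarithm, and evaluates the register widths from the two expressions in the text. Rounding the widths up to whole qubits is an assumption made here, and the helper names are hypothetical.

```python
from math import ceil, log, log2


def discrete_log(value: int, c: int, n: int) -> int:
    """Smallest e >= 0 with c**e % n == value (brute force)."""
    e, power = 0, 1 % n
    while power != value:
        power = (power * c) % n
        e += 1
        if e > n:
            raise ValueError("value is not a power of c modulo n")
    return e


def function_register(c: int, n: int, n_args: int):
    """Return (partial, full): C^x mod N and log_C[C^x mod N] for x = 0..n_args-1."""
    partial = [pow(c, x, n) for x in range(n_args)]
    full = [discrete_log(v, c, n) for v in partial]
    return partial, full


def register_widths(c: int, n: int):
    """(log2 N, log2 log_C N), each rounded up to a whole number of qubits (assumption)."""
    return ceil(log2(n)), ceil(log2(log(n, c)))


partial, full = function_register(2, 15, 4)
print(partial, full)            # function-register values for x = 0, 1, 2, 3
print(register_widths(2, 15))   # qubits before and after full compilation
```

For $N=15$, $C=2$ the widths come out as 4 and 2 qubits, matching the reduction stated above.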

FIG. 3: Algorithm outputs given by measured argument-register density matrices. The diagonal
elements are the logical output probabilities. a) Order-2 algorithm. The fidelity with the
ideal state is $F$=99.9±0.3%, the linear entropy is $S_L$=100±1% [11]. Combined with the
redundant qubit the logical probabilities are $\{P_{00}, P_{10}\}$={52, 48}±3%. b) Order-4
algorithm, $F$=98.5±0.6%, and $S_L$=98.1±0.8%. The logical probabilities are
$\{P_{000}, P_{010}, P_{100}, P_{110}\}$={27, 23, 24, 27}±2%. Real parts shown, imaginary
parts are less than 0.6%.

Fig. 3 shows the measured density matrices of the argument-register output for both
algorithms, sans the redundant top-rail qubit. Ideally these are maximally-mixed states
[22]: in all cases we measure near-unity fidelities [27, 28]. The outputs of the routines
are the logical state probabilities, i.e. the diagonal elements of the matrices. Combining
these with the known state of the redundant qubit, and reversing the argument qubits as
required, gives the binary outputs of the algorithm, which after classical processing yield
the prime factors of $N$. In the order-2 circuit the binary outputs of the algorithm are 00
or 10: the former represents the expected failure mode of this circuit, the latter a
successful determination of $r=2$; failure and success should have equal probabilities, and
we measure them to be 50% to within error. Thus half the time the algorithm yields $r=2$,
which gives the factors, 3 and 5. In the order-4 circuit the binary outputs are 000, 010,
100 and 110: the second and fourth terms yield the order-4 result, the first is a failure
mode and the third yields trivial factors. We measure output probabilities of 25% to within
error, as expected. After classical processing half the time the algorithm finds $r=4$,
again yielding the factors 3 and 5.
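
The classical processing of the binary outputs can be sketched as follows. The text does not spell out the procedure, so this assumes the standard approach: read the output as a phase $y/2^n$, take its best rational approximation, and keep the denominator only if it passes the check $C^r \bmod N = 1$. The name `candidate_order` is hypothetical.

```python
from fractions import Fraction


def candidate_order(output_bits: str, c: int, n: int):
    """Map a measured argument-register bit string to a verified order, or None.

    The bit string is read as an integer y, the phase y / 2**len(bits) is
    approximated by a fraction with denominator below n, and the denominator
    is returned only if it satisfies c**r % n == 1.
    """
    y = int(output_bits, 2)
    if y == 0:
        return None  # the all-zero output carries no information about r
    phase = Fraction(y, 2 ** len(output_bits)).limit_denominator(n - 1)
    r = phase.denominator
    return r if pow(c, r, n) == 1 else None


for bits in ("000", "010", "100", "110"):
    print(bits, candidate_order(bits, c=2, n=15))  # order-4 routine
for bits in ("00", "10"):
    print(bits, candidate_order(bits, c=4, n=15))  # order-2 routine
```

In this sketch the outputs 010 and 110 return $r=4$ and the output 10 returns $r=2$, while 000, 00 and 100 return no verified order, so each routine succeeds for half of its equally likely outputs.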

These results show that we have near-ideal algorithm performance, far better than we have
any right to expect given the known errors inherent in the logic gates [8, 29]. This
highlights that the algorithm performance is not always an accurate indicator of circuit
performance since the algorithm produces mixed states. In the absence of the gates the
argument-register qubits would remain pure; as they are mixed they have become entangled
with something outside the argument-register. From algorithm performance we cannot
distinguish between desired mixture arising from entanglement with the function-register,
and undesired mixture due to environmental decoherence. Circuit performance is crucial if
the circuit is to be incorporated as a sub-routine in a larger algorithm, Fig. 1a), e), and
g). The joint state of both registers after modular exponentiation indicates circuit
performance; we find entangled states that partially overlap with the expected states,
Fig. 4, indicating some environmental decoherence. The fidelity of the four-qubit state with
the ideal, Fig. 4b), is higher than that of the three-qubit state, Fig. 4a), chiefly because
the latter requires nonclassical interference of photons produced by independent sources,
which suffer higher distinguishability, lowering gate performance [30, 31].
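
The point that an ideal circuit leaves the argument-register mixed can be illustrated numerically. The sketch below builds the ideal joint state for $C=2$, $N=15$ with a two-qubit argument-register, traces out the function-register, and evaluates a linear entropy. The normalisation $S_L = \frac{d}{d-1}(1 - \mathrm{Tr}\,\rho^2)$ is assumed here (the source does not state its definition), and all names are hypothetical.

```python
def joint_state(c: int, n: int, dim_arg: int, gates_on: bool = True) -> dict:
    """Normalised sum_x |x>|C^x mod N>, or |x>|1> for every x if the gates are switched off."""
    amp = complex(dim_arg ** -0.5)
    return {(x, pow(c, x, n) if gates_on else 1): amp for x in range(dim_arg)}


def reduced_argument_state(amplitudes: dict, dim_arg: int) -> list:
    """Trace out the function register of a pure state given as {(x, y): amplitude}."""
    rho = [[0j] * dim_arg for _ in range(dim_arg)]
    for (x1, y1), a1 in amplitudes.items():
        for (x2, y2), a2 in amplitudes.items():
            if y1 == y2:  # only matching function-register states survive the trace
                rho[x1][x2] += a1 * a2.conjugate()
    return rho


def linear_entropy(rho: list) -> float:
    """d/(d-1) * (1 - Tr[rho^2]): 0 for a pure state, 1 for a maximally mixed one."""
    d = len(rho)
    purity = sum((rho[i][j] * rho[j][i]).real for i in range(d) for j in range(d))
    return d / (d - 1) * (1 - purity)


with_gates = reduced_argument_state(joint_state(2, 15, 4), 4)
without_gates = reduced_argument_state(joint_state(2, 15, 4, gates_on=False), 4)
print(linear_entropy(with_gates), linear_entropy(without_gates))
```

With the gates on, the reduced state is maximally mixed (linear entropy close to 1); with the gates off it stays pure (close to 0). The diagonal elements are identical in both cases, which is why the output probabilities alone cannot reveal where the mixture comes from.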




FIG. 4: Measured density matrices of the state of both registers after modular
exponentiation. a) Order-2 circuit. Ideal state is locally equivalent to a GHZ state: we
find $F_{GHZ}$=59±4%. The state is partially-mixed, $S_L$=62±4%, and entangled, violating
the optimal GHZ entanglement witness $W_{GHZ} = 1/2 - F_{GHZ}$ = -9±4%. b) Order-4 circuit.
Measured fidelity with the ideal state, a tensor product of two Bell-states, is $F$=68±3%.
The state is partially-mixed, $S_L$=52±4%, and entangled, with tangles of the component
Bell-states of 41±5% and 33±5%. Real parts shown, imaginary parts are respectively less than
7% and 4%.

Process tomography fully characterises circuit performance, yielding the $\chi$-matrix, a
table of process measurement outcomes and the coherences between them. Measured and ideal
$\chi$-matrices can be quantitatively compared using the fidelity [11]; we measured process
fidelities of $F_p$=85%, 89% for the two-qubit gates of the order-4 circuit. It is the
easier of the two algorithms to characterise since it consists of two gates acting on
independent qubit pairs. Consequently, by assuming that only these gates induce error, the
order-4 circuit process fidelity is simply the product of the individual gate fidelities,
$F_p^{\text{order-4}} = F_p^{(1)} F_p^{(2)}$ = 80%. Clearly this is significantly less than
the algorithm success rate of 99.7%. The order-2 circuit is harder to characterise,
requiring at least 4096 measurements, infeasible with our count rates. Decomposing the
three-qubit gate into a pair of two-qubit gates yields process fidelities $F_p$=78%, 90%
(again reflecting differing interferences of independent and dependent photons). There is no
simple relation between individual CZ gate performances and that of the three-qubit gate.
However, a bound can be obtained by chaining the gate errors, $F_p$ > 20% [33]. This is not
a useful bound, c.f. the fidelity between an ideal CZ and doing nothing at all of $F_p$=25%!
(The bound only becomes practical as $F_p \to 1$.) For larger circuits, full tomographic
characterisation becomes exponentially impractical. The order-finding routine registers
contain $k=n+m$ qubits: state and process tomography of a $k$-qubit system require at least
$2^{2k}$ and $2^{4k}$ measurements, respectively.
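
Two of the numbers in this paragraph are easy to reproduce. The sketch below evaluates the measurement counts $2^{2k}$ and $2^{4k}$, and computes the fidelity between an ideal CZ and the identity using $|\mathrm{Tr}(U^\dagger V)|^2/d^2$, which is assumed here as the process fidelity between two unitary processes. The function names are hypothetical.

```python
def tomography_settings(k: int):
    """Minimum measurement counts quoted in the text: (state, process) = (2^(2k), 2^(4k))."""
    return 2 ** (2 * k), 2 ** (4 * k)


def unitary_process_fidelity(u: list, v: list) -> float:
    """|Tr(U^dagger V)|^2 / d^2 for two d x d unitaries given as nested lists."""
    d = len(u)
    trace = sum(complex(u[i][j]).conjugate() * v[i][j] for i in range(d) for j in range(d))
    return abs(trace) ** 2 / d ** 2


CZ = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

print(tomography_settings(3))                  # three-qubit circuit
print(unitary_process_fidelity(CZ, IDENTITY))  # ideal CZ versus doing nothing
```

For $k=3$ the process count is 4096, the figure quoted for the order-2 circuit, and the CZ-versus-identity fidelity evaluates to 0.25, which is why a bound of 20% carries no information.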

An alternative is to gauge circuit performance via the logical correlations between the
registers. Modular exponentiation produces the entangled state
$\sum_{x=0}^{2^n-1}|x\rangle|y\rangle$, where $y$ is respectively $C^x \bmod N$ and
$\log_C[C^x \bmod N]$ for partial and full compilation. For a correctly functioning circuit,
measuring the argument in the state $x$ projects the function into $y$, requiring at most
$2^n$ measurements to check. The results in Fig. 5 show there is a clear correlation between
the argument and function registers, 59 to 83% and 67 to 87% for the order-2 and order-4
circuits, respectively. Again, these indicative values of circuit operation are
significantly less than the algorithm success rates.
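
A correlation check of this kind could be evaluated from coincidence counts roughly as follows. This is a sketch for the partially-compiled case, where the expected function value is $C^x \bmod N$; the counts in the example are invented purely for illustration and are not data from the experiment, and `register_correlations` is a hypothetical name.

```python
def register_correlations(counts: dict, c: int, n: int) -> dict:
    """For each measured argument x, the fraction of events with function value C^x mod N.

    `counts` maps (x, y) pairs to event numbers; one entry per argument value x
    is returned, so 2^n argument settings cover an n-qubit argument register.
    """
    totals, hits = {}, {}
    for (x, y), events in counts.items():
        totals[x] = totals.get(x, 0) + events
        if y == pow(c, x, n):
            hits[x] = hits.get(x, 0) + events
    return {x: hits.get(x, 0) / totals[x] for x in sorted(totals)}


# Hypothetical counts for illustration only (not measured values).
example_counts = {
    (0, 1): 80, (0, 4): 20,   # argument 0: expected function value 4^0 mod 15 = 1
    (1, 4): 60, (1, 1): 40,   # argument 1: expected function value 4^1 mod 15 = 4
}
print(register_correlations(example_counts, c=4, n=15))
```

Each returned value is the conditional probability that the function-register holds the expected state given the measured argument, which is the kind of quantity plotted in Fig. 5.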

FIG. 5: Measured function-register probabilities after modular exponentiation, conditioned
on logical measurement of the argument-register, $M_x$. There is a high correlation between
the registers: a) Order-2 circuit, probabilities {83±4%, 59±5%}; b) Order-4 circuit,
$\{P_{00}, P_{01}, P_{10}, P_{11}\}$={87±3%, 84±4%, 82±5%, 67±6%}.

We have experimentally implemented every stage of a small-scale quantum algorithm. Our
experiments demonstrate the feasibility of executing complex, multiple-gate quantum circuits
involving coherent multi-qubit superpositions of data registers. We present two different
implementations of the order-finding routine at the heart of Shor's algorithm,
characterising the algorithmic and circuit performances. Order-finding routines are a
specific case of phase-estimation routines, which in turn underpin a wide variety of quantum
algorithms, such as those in quantum chemistry. Besides providing a proof of the use of
quantum entanglement for arithmetic calculations, this work points to a number of
interesting avenues for future research, in particular the advantages of tailoring algorithm
design to specific physical architectures, and the urgent need for efficient diagnostic
methods of
large quantum information circuits.

Additional Online Material. For all the circuits Fig. 1b)-g), the consecutive Hadamards in
the top qubit of the argument-register cancel each other out (since $H^2=I$): consequently
both this qubit, and the gate(s) controlled by it, are redundant and need not be implemented
experimentally. The remaining argument-register qubits are maximally-entangled to the
function-register. Since the function-register output is not measured, these argument qubits
are maximally-mixed, and the subsequent gates in the inverse QFT are therefore also
redundant. Thus the inverse QFT in Ref. [14] was unnecessary: indeed, it is straightforward
to show this is true for any order-$2^l$ circuit. After modular exponentiation, the circuit
state is $\sum_{x=0}^{2^n-1}|x\rangle|C^x \bmod N\rangle$: for any two values $x$ and $y$
that differ by an integer number, $k$, of orders, i.e. $y - x = k2^l$,
$C^x \bmod N = C^y \bmod N$, and the state after modular exponentiation becomes
$\sum_{k=0}^{2^{n-l}-1}\sum_{a=0}^{2^l-1}|k2^l + a\rangle|C^a \bmod N\rangle$. Note that the
first $n-l$ qubits of the argument-register (top to bottom) encode the number $k$, the
remaining $l$ qubits encode $2^l$ distinct values of $a$: we divide the argument-register
accordingly, $\sum_{k,a}|k\rangle|a\rangle|C^a\rangle$. The $|k\rangle$ qubits do not become
entangled to the function-register whereas the $|a\rangle$ qubits are maximally-entangled to
it; consequently after tracing out the function-register, the $|a\rangle$ qubits are in a
maximally-mixed state and any further gates acting on them are redundant. Application of
Hadamard gates in the inverse QFT resets the $|k\rangle$ qubits to 0, inhibiting any gates
controlled by them. The final step of the inverse QFT is to swap the first and last qubits
of the argument register, which can be done after measurement. Thus the inverse QFT can be
omitted in all cases $r=2^l$.
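
The decomposition $x = k2^l + a$ used in this argument can be checked directly for the two cases studied. This is a minimal sketch with hypothetical function names; it only verifies the arithmetic identity $C^x \bmod N = C^a \bmod N$, not the quantum circuit itself.

```python
def split_argument(x: int, l: int):
    """Split x = k * 2**l + a into (k, a): high bits k, lowest l bits a."""
    return x >> l, x & ((1 << l) - 1)


def function_depends_only_on_a(c: int, n_mod: int, n_bits: int, l: int) -> bool:
    """True if C^x mod N == C^a mod N for every n_bits-wide argument x."""
    for x in range(2 ** n_bits):
        _, a = split_argument(x, l)
        if pow(c, x, n_mod) != pow(c, a, n_mod):
            return False
    return True


print(function_depends_only_on_a(2, 15, n_bits=3, l=2))  # order 4 = 2**2
print(function_depends_only_on_a(4, 15, n_bits=3, l=1))  # order 2 = 2**1
print(function_depends_only_on_a(2, 15, n_bits=3, l=1))  # 2**1 is not the order of 2
```

The first two checks succeed because the order is exactly $2^l$, so the $k$ bits never influence the function-register; the third fails, showing that the argument relies on the order being a power of two.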



